Understanding High Dimensional Spaces through Visual Means Employing Multidimensional Projections
Data visualisation helps in understanding data represented by multiple
variables, also called features, stored in a large matrix where individuals are
stored in rows and variable values in columns. These data structures are
frequently called multidimensional spaces. In this paper, we illustrate ways of
employing the visual results of multidimensional projection algorithms to
understand and fine-tune the parameters of their mathematical framework. Some
of the mathematical concepts common to these approaches are Laplacian matrices,
Euclidean distance, Cosine distance, and statistical methods such as
Kullback-Leibler divergence, employed to fit probability distributions and
reduce dimensions. Two of the relevant algorithms in the data visualisation
field are t-distributed stochastic neighbour embedding (t-SNE) and
Least-Square Projection (LSP). These algorithms can be used to understand
several ranges of mathematical functions including their impact on datasets. In
this article, mathematical parameters of underlying techniques such as
Principal Component Analysis (PCA) behind t-SNE and mesh reconstruction methods
behind LSP are adjusted to reflect the properties afforded by the mathematical
formulation. The results, supported by illustrative methods of the processes of
LSP and t-SNE, are meant to inspire students in understanding the mathematics
behind such methods, in order to apply them in effective data analysis tasks in
multiple applications.
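As a minimal sketch of the quantities named in this abstract (Euclidean distance, cosine distance, and Kullback-Leibler divergence), the following Python functions show how each is computed for small vectors. The function names and the pure-Python style are illustrative assumptions; they are not taken from the paper or from any t-SNE or LSP implementation.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """One minus the cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same support.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

Methods in the t-SNE family compare neighbourhood probability distributions in the high- and low-dimensional spaces with a divergence of this kind, so a value of zero indicates that the two distributions match.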
Together We Are Better: Professional Learning Networks for Teachers
In recent years, many educators have turned to professional learning networks (PLNs) to grow in their craft with peers who are more accessible online because of reduced temporal and spatial constraints. While educators have cultivated PLNs, there is a dearth of research about the effects of PLNs. This manuscript reports the findings of a qualitative study that investigated PLN experiences through the analysis of survey data from 732 P-12 teachers. Data analysis suggests that the anytime, anywhere availability of expansive PLNs, and their capacity to respond to educators' diverse interests and needs, appear to offer possibilities for supporting the professional growth of whole teachers. These findings have implications for defining the present and future of teacher learning in a digital age.
LDPP at the FinNLP-2022 ERAI task: Determinantal point processes and variational auto-encoders for identifying high-quality opinions from a pool of social media posts
Social media and online forums have made it easier for people to share their views and opinions on various topics in society. In this paper, we focus on posts discussing investment-related topics. When it comes to investment, people can now easily share their opinions about online traded items and also provide rationales to support their arguments on social media. However, there are millions of posts to read, some of which may come from amateur investors or be completely unrelated. Identifying the most important posts that could lead to higher maximal potential profit (MPP) and lower maximal loss for investment is not a trivial task. In this paper, we propose to use determinantal point processes and variational autoencoders to identify high-quality posts from the given rationales. Experimental results suggest that our method mines higher-quality posts than random selection and that latent variable modeling improves the quality of the selected posts.
Bayes at FigLang 2022 Euphemism detection shared task: Cost-sensitive Bayesian fine-tuning and Venn-Abers predictors for robust training under class skewed distributions
Transformers have achieved state-of-the-art performance across most natural language processing tasks. However, the performance of these models degrades when they are trained on skewed class distributions (class imbalance), because training tends to be biased towards head classes with most of the data points. Classical methods that have been proposed to handle this problem (re-sampling and re-weighting) often suffer from unstable performance, poor applicability and poor calibration. In this paper, we propose to use Bayesian methods and Venn-Abers predictors for well-calibrated and robust training against class imbalance. Our proposed approach improves the F1-score of the baseline RoBERTa (A Robustly Optimized Bidirectional Embedding from Transformers Pretraining Approach) model by about 6 points (79.0% against 72.6%) when training with class-imbalanced data.
Disadvantaged by degrees? How Widening Participation students are not only hindered in accessing HE, but also during – and after – university.
There is no shortage of literature addressing the range of reasons why more disadvantaged groups are underrepresented in higher education – and particularly elite universities – in the UK, and it is clear that this has little to do with any real deficiency in terms of ability. This paper begins with an overview of this issue but then extends the argument beyond widening participation at the point of access. It raises concerns emerging from two relatively under-researched areas in the literature which indicate that ‘widening participation’ – WP – students are faced with greater inequalities than their more affluent peers both during their undergraduate degrees and beyond them. Although the focus here is on the UK, this topic and many of its themes will be familiar to educationalists and HE practitioners in other countries.
GGNN@Causal News Corpus 2022: Gated graph neural networks for causal event classification from social-political news articles
Causality is a core cognitive concept, and discovering causality mentions in text appears in many natural language processing (NLP) applications. In this paper, we study the task of Event Causality Identification (ECI) from social-political news. The aim of the task is to detect causal relationships between event mention pairs in text. Although deep learning models have recently achieved state-of-the-art performance on many tasks and applications in NLP, most of them still fail to capture the rich semantic and syntactic structures within sentences, which are key for causality classification. We present a solution for causal event detection from social-political news that captures semantic and syntactic information based on gated graph neural networks (GGNN) and contextualized language embeddings. Experimental results show that our proposed method outperforms the baseline model (BERT, Bidirectional Embeddings from Transformers) in terms of F1-score and accuracy.
Implementation of and Early Outcomes From Anal Cancer Screening at a Community-Engaged Health Care Facility Providing Care to Nigerian Men Who Have Sex With Men.
Purpose: Anal cancer risk is substantially higher among HIV-infected men who have sex with men (MSM) as compared with other reproductive-age adults, but screening is rare across sub-Saharan Africa. We report the use of high-resolution anoscopy (HRA) as a first-line screening tool and the resulting early outcomes among MSM in Abuja, Nigeria. Methods: From August 2016 to August 2017, 424 MSM enrolled in an anal cancer screening substudy of TRUST/RV368, a combined HIV prevention and treatment cohort. HRA-directed biopsies were diagnosed by histology, and ablative treatment was offered for high-grade squamous intraepithelial lesions (HSIL). HRA proficiency was assessed by evaluating the detection of squamous intraepithelial lesions (SIL) over time and the proportion biopsied. Prevalence estimates of low-grade squamous intraepithelial lesions and HSIL with 95% CIs were calculated. Multinomial logistic regression was used to identify those at the highest risk of SIL. Results: Median age was 25 years (interquartile range [IQR], 22-29), median time since sexual debut was 8 years (IQR, 4-12), and 59% (95% CI, 54.2% to 63.6%) were HIV infected. Rate of detection of any SIL stabilized after 200 screenings, and less than 20% had two or more biopsies. Preliminary prevalence estimates of low-grade squamous intraepithelial lesions and HSIL were 50.0% (95% CI, 44.7% to 55.3%) and 6.3% (95% CI, 4.0% to 9.3%). HIV infection, at least 8 years since anal coital debut, concurrency, and external warts were independently statistically associated with SIL. Conclusion: Proficiency with HRA increased with experience over time. However, HSIL detection rates were low, potentially affected by obstructed views from internal warts and low biopsy rates, highlighting the need for ongoing evaluation and mentoring to validate this finding. HRA is a feasible first-line screening tool at an MSM-friendly health care facility. Years since anal coital debut and external warts could prioritize screening.
UCCNLP@SMM4H’22: Label distribution aware long-tailed learning with post-hoc posterior calibration applied to text classification
The paper describes our submissions for the Social Media Mining for Health (SMM4H) workshop 2022 shared tasks. We participated in two tasks: (1) classification of adverse drug event (ADE) mentions in English tweets (Task 1a) and (2) classification of self-reported intimate partner violence (IPV) on Twitter (Task 7). We proposed an approach that uses RoBERTa (A Robustly Optimized BERT Pretraining Approach) fine-tuned with a label distribution-aware margin loss function and post-hoc posterior calibration for robust inference against class imbalance. We achieved a 4% and 1% increase in performance on IPV and ADE, respectively, when compared with the traditional fine-tuning strategy with unweighted cross-entropy loss.
UNLPS at TextGraphs-16 Natural Language Premise Selection task: Unsupervised Natural Language Premise Selection in mathematical text using sentence-MPNet
This paper describes our system for the submission to the TextGraphs 2022 shared task at COLING 2022: Natural Language Premise Selection (NLPS) from mathematical texts. The task of NLPS is to select, from a knowledge base written in natural language and mathematical formulae, the mathematical statements (called premises) that are most likely to be used in a particular mathematical proof. We formulated this task as an unsupervised semantic similarity task by first obtaining contextualized embeddings of both the premises and the mathematical proofs using sentence transformers. We then computed the cosine similarity between the embeddings of premises and proofs and selected the premises with the highest cosine scores as the most probable. Our system improves over the baseline system, which uses bag-of-words models based on term frequency-inverse document frequency (TF-IDF), in terms of mean average precision (MAP) by about 23.5% (0.1516 versus 0.1228).
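The ranking step described in this abstract can be sketched as follows, assuming the embeddings have already been computed. The vectors below are made-up toy values, not sentence-MPNet outputs, and the names rank_premises and relative_improvement are hypothetical helpers introduced only for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

def rank_premises(proof_vec, premise_vecs, top_k):
    """Return the top_k (name, score) pairs, highest cosine similarity first."""
    scored = [(name, cosine_similarity(proof_vec, vec))
              for name, vec in premise_vecs.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]

def relative_improvement(baseline, system):
    """Relative gain of system over baseline, as a fraction of the baseline."""
    return (system - baseline) / baseline

# Toy embeddings (assumed values for illustration only).
premises = {
    "premise_a": [1.0, 0.0, 0.0],
    "premise_b": [0.7, 0.7, 0.0],
    "premise_c": [0.0, 0.0, 1.0],
}
proof = [1.0, 0.2, 0.0]

top = rank_premises(proof, premises, top_k=2)
gain = relative_improvement(0.1228, 0.1516)  # MAP figures quoted in the abstract
```

With these toy vectors, premise_a and premise_b are selected in that order, and the relative gain computed from the two MAP values is roughly 0.235, consistent with the reported figure of about 23.5%.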
Perspective Chapter: Understanding Thermal Maturity Evolution and Hydrocarbon Cracking – Implication for Cretaceous Awgu and Nkporo Shales, Southeastern Nigeria
One-dimensional basin modeling was carried out using Schlumberger’s PetroMod modeling software, which provided understanding of the thermal evolution and the timing of hydrocarbon generation and expulsion of the Coniacian Awgu Shale and the Campanian Nkporo Shale penetrated by the Nzam-1 and Akukwa-2 wells in the lower Benue Trough, Nigeria. The burial temperature and vitrinite reflectance values ranged from 30 to 145°C and 0.5 to 2.9%Ro for Awgu Formation, 28 to 125°C and 0.5 to 1.5%Ro for Nkporo Formation in Nzam-1 well model; 29.5 to 145°C and 0.8 to 2.4%Ro for Awgu Formation, and 28.5 to 95°C and 0.6 to 0.8%Ro for Nkporo Formation in Akukwa-2 well model. Awgu Shale reached the required threshold of the oil generation window during mid Campanian (75 Ma) and late Santonian (82 Ma) in Nzam-1 and Akukwa-2 well models, respectively. Nkporo Shale entered the required oil window threshold during early Paleocene (65 Ma) in Nzam-1 well model and late Maastrichtian (67 Ma) in Akukwa-2 well model. This study revealed that valid petroleum system elements exist in the Anambra basin, and that some amount of gaseous hydrocarbons and little oil may have been generated and expelled. Exponential decrease in temperature over time has favored the preservation of the gas reservoirs and the survival of hydrocarbons in the deep strata. The early maturity of Nkporo Shale can be attributed to lack of the requisite burial depth, temperature and pressure in favor of oil generation and expulsion. Post-maturity status of Awgu Shales may be associated with deeper burial depth and possibly with the effect of the Santonian tectonic episode.